Overview

Analysing daily mixes and discover weekly playlists

As an active Spotify user I often contemplate maintaining an enormous playlist of songs that I like. However, in my opinion, if you really like a song you’ll remember it, so there is no need for a playlist to remind you of what songs you like. That being said, I do need some type of playlist to get my day started, so I listen to my daily mixes almost exclusively. Occasionally I listen to my discover weekly playlist, which always surprises me with the quality of its recommendations. Somehow these playlists seem fresh every time while still containing a lot of songs that I know and love.

This made me wonder how these playlists vary from day to day and week to week and how my discover weekly relates to my daily mixes.

Therefore I will be analyzing my daily mixes and discover weekly playlists over a period of 7 weeks. The interest lies mostly in the relation between the daily mixes and the discover weekly playlist. There are two things I hope to discover:

  1. Is there a way to predict my discover weekly playlist of the next week based on my daily mixes of the current week?

  2. Is there a way to predict which daily mixes I listened to in the previous week(s) based on my discover weekly playlist?

My daily mixes usually range from playlists containing Pixies and The Velvet Underground, to playlists containing Miles Davis and Charlie Parker, to playlists containing Eagles of Death Metal and The Black Keys, to playlists containing Canned Heat and Roy Buchanan. Sometimes my daily mixes vary a lot among themselves, which can turn my discover weekly playlist into a total mess. This will be very interesting to analyze.

My Spotify user ID: kmkov4v2xhms7od6p3gq32wfv.

Below you can see a table of all the tracks that occur in my corpus, ordered in descending frequency of occurrence.

Analysis of the weekly playlists

Valence, energy, major/minor and tempo


In terms of energy/valence the discover weeklies are distributed fairly similarly. Some deviations are, for instance, weeks 4 and 6, which have their energy more concentrated in the center. Week 6 also seems to be more optimistic in general. All playlists consist mostly of songs in a major key, which, honestly, surprised me. Something interesting in week 2 is that the songs with the highest valence and highest energy are in a minor key. Week 7 also shows something remarkable: the happiest, most energetic songs seem to be in minor keys, whilst the songs with lower energy and lower valence seem to be in a major key.

Change of energy in songs in the discover weekly playlists


In this violin plot you can clearly see that Discover Weekly 4 and 6 have their energy more concentrated, whilst the others have it more evenly distributed.

Keys


It’s interesting to see how my discover weekly playlists start out with more pronounced peaks in the keys they are in. In the third week the peaks have all but disappeared; week 4 is mostly in B minor and G major. After that the keys seem relatively evenly distributed for the remaining weeks. D minor and D# minor appear the least often.

Tempo


Something I like to see is that the third week’s discover weekly playlist seems to have an almost Gaussian distribution around 125 bpm. This is nice because it coincides with the paper by Dirk Moelants about preferred tempo.

However, weeks 6 and 7 show something quite different, being much slower and centered around 100 bpm, which was the accepted idea of preferred tempo before this paper.

Loudness


In terms of loudness the playlists remain fairly constant across the weeks, with one clear outlier: week 3 has a peak at around -7.5 dB.

Daily vs Weekly


Valence and Energy 1

Valence and Energy 2

Valence and Energy 3

Valence and Energy 4

Valence and Energy 5

Valence and Energy 6

Valence and Energy 7


Some Explanation

Here you can see energy and valence plotted against each other for each week of playlists. You can see the daily mixes and filter them by date. Furthermore, the discover weekly playlist for that week is always shown.

What I find interesting is that the mean of the discover weekly playlist is often in the middle of the means of the daily mixes. So you could say that the discover weekly is right in the middle of everything.

This means that a discover weekly playlist can be interpreted as a linear combination of daily mixes, which we can use for estimation; more on that later.

However, do note, for instance, the discover weekly playlist of week 4: it sits far more in the top right than the daily mixes do.

All in all, the playlists are distributed fairly similarly, so we will likely need more features for our goal.

Chroma Features

Chromagrams for two versions of Paranoid Android

Radiohead


Chromagrams are useful graphs that show how a song’s energy is distributed over the twelve pitch classes over time. These chroma features are useful when we are thinking in terms of pitch and key.
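To make the chroma idea concrete, here is a minimal Python sketch (the analysis itself is in R) of folding spectral energy onto the twelve pitch classes. It assumes A4 = 440 Hz and simple nearest-semitone rounding; real chroma extractors, including whatever Spotify uses, are more sophisticated:

```python
import numpy as np

def chroma_from_spectrum(freqs, mags, a4=440.0):
    """Fold a magnitude spectrum onto 12 pitch classes (C=0 ... B=11)."""
    valid = freqs > 0
    # MIDI note number of each frequency bin, rounded to the nearest semitone
    midi = 69 + 12 * np.log2(freqs[valid] / a4)
    pitch_class = np.mod(np.round(midi), 12).astype(int)
    chroma = np.zeros(12)
    np.add.at(chroma, pitch_class, mags[valid])   # sum energy per pitch class
    total = chroma.sum()
    return chroma / total if total > 0 else chroma

# a pure 440 Hz tone lands entirely on pitch class 9 (A)
print(chroma_from_spectrum(np.array([440.0]), np.array([1.0])))
```

Doing this per analysis frame, rather than once for the whole song, gives the chromagram.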

Here you can see a chromagram of two songs of which one is an instrumental cover of the other (Radiohead made the original).


Same song, drastically differently performed. You can see that both versions have a similar structure regarding the chord changes, but the changes occur at different moments. Also, Brad Mehldau really emphasizes one chord, whereas the Radiohead version has its chroma magnitude more spread out at each point in time. A reason for this is probably that the Brad Mehldau version is mostly piano. Also note that the original (Radiohead) version ends mostly in A with some magnitude in D, E, and F as well. Brad Mehldau takes this and turns it around, putting a lot of the energy in D, E, and F and less in A.

Brad Mehldau

Trying to find a dynamic time warping path for these two versions is quite hard


Due to the completely different timings and the generally different approach to the song, they are completely different songs according to Dynamic Time Warping. Let’s take a look at another example.

The DTW path is even less observable now


Even though you can immediately link both songs when you listen to them (probably due to the lyrics), they are nothing alike according to DTW.

Still, when I listen to these two covers I feel that they are more similar than the two “Paranoid Android” versions, which do contain some clearer warping paths (upper-right area).

From this we can conclude that finding similarity between songs is not as easy as you might think, and that just knowing the pitches will not be enough. Apparently two covers with completely different instrumentation can be more similar than two covers that share instruments and lyrics but are performed quite differently. But of course, we haven’t accounted for timbre features yet.
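For reference, the DTW comparison above boils down to the textbook dynamic-programming recursion. A minimal Python sketch (the analysis itself is in R; frame-wise cosine distance and the three-step recursion are the standard choices, but other step patterns and normalizations exist):

```python
import numpy as np

def dtw_cost(X, Y):
    """Accumulated DTW cost between two feature sequences.
    X: (n, d) array, Y: (m, d) array, e.g. one chroma vector per frame."""
    # cosine distance between every pair of frames
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Yn = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    dist = 1.0 - Xn @ Yn.T
    n, m = dist.shape
    # dynamic programming over the three admissible steps
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i, j] = dist[i - 1, j - 1] + min(D[i - 1, j],
                                               D[i, j - 1],
                                               D[i - 1, j - 1])
    return D[n, m]

# identical sequences warp onto each other at zero cost; a reversed one does not
chroma = np.eye(12)[:8]                 # toy stand-in for chroma frames
print(dtw_cost(chroma, chroma))         # ~0.0
print(dtw_cost(chroma, chroma[::-1]))   # substantially larger
```

A clear diagonal in the accumulated cost matrix is exactly the “observable warping path” discussed above.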

(Side note: if you don’t know the Roy Buchanan version and you like the Jimi Hendrix version, really give Roy’s version a listen.)

Chroma and Timbre analyses for two completely different songs

Miss Alissa by Eagles of Death Metal and Heather by Billy Cobham

So I took a look at two songs that should be polar opposites in terms of valence and energy, and boy, is that right.

In the lower-left corner we have, with a valence of 0.041, an energy rating of 0.0079, and a length of 8 minutes and 39 seconds, the smooth and shockingly calming “Heather” by Billy Cobham.

In the opposing upper-right corner we have, with a valence of 0.877, an energy rating of 0.992, and a length of 2 minutes and 38 seconds, the you-can’t-stop-dancing-to-it “Miss Alissa” by Eagles of Death Metal.

I analysed the differences in chromagrams, cepstrograms and self-similarity matrices.

Chroma comparison


Miss Alissa seems to be all over the place in terms of chroma, whilst Heather is more organized and stays close to F for the whole duration of the song. Of course, Heather is a lot longer, which leaves a lot more room for organization and build-up.

MFCC comparison


Miss Alissa seems to be mostly concentrated in the first cepstral coefficient, whilst Heather is concentrated more in the second and third.

The first cepstral coefficient relating to loudness seems appropriate here, because Miss Alissa is quite loud and Heather is quite quiet.

Around 220 seconds the saxophone joins Heather, which is observable in the cepstrogram as the distribution of magnitude becoming more spread out.

Self-similarity matrices: Timbre


The self-similarity matrices in terms of timbre show that Miss Alissa is fairly constant, whilst Heather shows a big change around 220 seconds, when the saxophone takes the lead.

Self-similarity matrices: Pitch


Now it’s funny to see that the chromagram of Miss Alissa is complete chaos, but its self-similarity matrix is fairly constant and looks like one block, implying that the song might be quite chaotic but that the chaos is very constant, which is actually a pretty good description of the song.

On the other hand, Heather has a very organized chromagram that looks fairly constant, but its self-similarity matrix in terms of pitch looks a lot more complex than that of Miss Alissa.

Clustering comparison to daily mixes

Clustering


I thought it might be fun to see if I could cluster a day of songs into 6 clusters resembling daily mixes. For this, hierarchical clustering did not seem like the best option, because it always branches in two, meaning I’d have to pick a depth with either 4 or 8 clusters.

Therefore I have used k-means with k=6 to find 6 clusters.

I used energy, key, loudness, mode, speechiness, acousticness, instrumentalness, liveness, valence, tempo, time_signature, track.duration_ms and the MFCCs as features for clustering.
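The clustering step can be sketched with plain Lloyd’s-algorithm k-means over a z-normalized feature matrix. This is a Python illustration on random stand-in data, not my actual corpus (the analysis itself is in R):

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Plain Lloyd's algorithm: cluster the rows of X into k groups."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign each song to its nearest center
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # move each center to the mean of its members
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

rng = np.random.default_rng(1)
X = rng.normal(size=(120, 13))            # stand-in: 120 tracks x 13 features
X = (X - X.mean(axis=0)) / X.std(axis=0)  # z-normalize each feature first
labels, centers = kmeans(X, k=6)          # six "daily mix"-like clusters
```

Z-normalizing first matters because otherwise features on large scales, such as track duration in milliseconds, would dominate the distances.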

Here you can see a graph of the energy and valence of the daily mixes (top) and the clusters (bottom). It is fairly apparent that valence and energy do not have much say in distinguishing either the daily mixes or the k-means clusters.

Let’s find the important features.


Training a classifier to determine important cluster features

Classifier Precision and Recall


I do not know if there is a more efficient way to do this, but this seemed suitable: training a random forest classifier to classify the clusters acquired by k-means clustering will show which features are most important when determining which song belongs in which cluster.

The precisions seem pretty okay, and the importances show that “mode”, “instrumentalness” and “energy” are very important when assigning songs to clusters.

Let’s take a look.


class precision recall
1 0.9148936 0.9772727
2 1.0000000 0.7142857
3 0.9047619 0.7600000
4 0.8409091 0.8222222
5 0.9047619 1.0000000
6 0.9722222 0.9459459
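The importance step itself can be sketched with scikit-learn’s random forest on stand-in data (the analysis above uses R; the idea is the same either way: fit the forest on the k-means labels and read off the impurity-based importances):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 13))    # stand-in feature matrix (tracks x features)
y = rng.integers(0, 6, size=120)  # stand-in k-means cluster labels

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
# impurity-based importances: higher = more discriminative for the clusters
importances = forest.feature_importances_
print(importances.argsort()[::-1][:3])  # indices of the three top features
```

With the real data, the three indices printed at the end would correspond to mode, instrumentalness and energy.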

Important features

Important features for clusters

Energy and instrumentalness

Mode


It’s interesting to see that mode completely separates two groups of clusters, while the actual playlists are distributed equally among the modes. This discriminative power of “mode” is probably due to its binary nature.


Estimating listen History from discover weeklies

Basic Idea

So now that we have an idea of what features we can extract and analyze using the Spotify API, let’s return to the research questions:

  1. Is there a way to predict my discover weekly playlist of the next week based on my daily mixes of the current week?

  2. Is there a way to predict which daily mixes I listened to in the previous week(s) based on my discover weekly playlist?

Starting off, estimating my next discover weekly based on my daily mixes is going to be difficult. It would require me to compare the daily mixes to every song on Spotify to find a fitting match for a discover weekly playlist. This is something Spotify does very well, and I doubt I will be able to recreate this wonderful algorithm for this project.

However, determining which daily mixes I listened to the most should be doable. Suppose we look at a discover weekly playlist and the daily mixes of the previous week. We can then write this down as an equation where each daily mix has a weight determining how much I listened to it:

\(DW = DM_A^1 * W_1 + DM_B^1 * W_2 +...+ DM_F^7*W_{42}\)

Where each playlist is actually a set of features. Now an optimal solution where \(DW\) is estimated exactly likely does not exist, but this can be interpreted as a linear regression problem, which we can solve.

Feature space

Features on track level


Below you can see a table consisting of one week of daily mixes and a discover weekly playlist. Every feature has been z-normalized to ensure that all features are treated as equally important. In this manner every song can be interpreted as a set of features and their values.
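Z-normalization itself is just subtracting each feature’s mean and dividing by its standard deviation. A two-line Python sketch (the analysis itself is in R; the numbers below are toy values, not real tracks):

```python
import numpy as np

def z_normalize(X):
    """Give every feature column zero mean and unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

X = np.array([[0.2, -30.0],
              [0.8, -10.0],
              [0.5, -20.0]])   # e.g. columns: energy, loudness
Z = z_normalize(X)             # each column now has mean 0 and std 1
```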

Features on playlist level


However, we are interested in listen history at the playlist level, not the track level. For simplicity we can take the mean of each feature for every playlist, resulting in the following feature space:
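Collapsing the track table to playlist-level features is a group-by-and-average. In pandas it would look like this (toy rows with hypothetical playlist names; the analysis itself is in R):

```python
import pandas as pd

# toy stand-in for the z-normalized track table: one row per track
tracks = pd.DataFrame({
    "playlist": ["DM_A1", "DM_A1", "DM_B1", "DM_B1", "DW"],
    "energy":   [ 0.5,    -0.5,     1.2,     0.8,    0.1],
    "loudness": [ 1.0,     0.0,    -0.4,     0.6,    0.5],
})

# one mean feature vector per playlist
playlist_features = tracks.groupby("playlist").mean()
print(playlist_features)
```

Averaging throws away within-playlist variation, but it gives every playlist a single feature vector, which is exactly what the regression below needs.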

Least squares fitting

Regular Least squares


We are interested in the equation:

\(DW = DM_A^1 * W_1 + DM_B^1 * W_2 +...+ DM_F^7*W_{42}\)

which, from the table, reduces to:

\[\begin{bmatrix} energy_{DW} \\ loudness_{DW} \\ ... \\ B_{DW} \end{bmatrix} =\begin{bmatrix} energy_{DM_A^1} \\ loudness_{DM_A^1} \\ ... \\ B_{DM_A^1} \end{bmatrix} *W_1 + .... + \begin{bmatrix} energy_{DM_F^7} \\ loudness_{DM_F^7} \\ ... \\ B_{DM_F^7} \end{bmatrix} * W_{42} \]

Now, without getting too deep into linear algebra: because there most likely is no exact solution, we need to perform a least squares estimation. For this we reformulate the equation as a matrix equation:

\[\begin{bmatrix} DM_A^1 & DM_B^1 & ... & DM_F^7\end{bmatrix} \begin{bmatrix} W_1 \\ W_2 \\ \vdots \\ W_{42} \end{bmatrix}= \begin{bmatrix} energy_{DW} \\ loudness_{DW} \\ \vdots \\ B_{DW} \end{bmatrix}\]

Where each \(DM\) is a vector of all the features. To perform a least squares estimate we multiply each side by the transpose of our daily mix feature matrix and then solve for the vector of weights:

\[\begin{bmatrix} DM_A^1 & DM_B^1 & ... & DM_F^7\end{bmatrix}^T \times \begin{bmatrix} DM_A^1 & DM_B^1 & ... & DM_F^7\end{bmatrix} \times \begin{bmatrix} W_1 \\ W_2 \\ \vdots \\ W_{42} \end{bmatrix} = \begin{bmatrix} DM_A^1 & DM_B^1 & ... & DM_F^7\end{bmatrix}^T \times DW\]
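In practice a least squares solver handles the normal equations for us. A Python sketch with random stand-in numbers (the analysis itself is in R; the 13 features here are a placeholder, my real feature count may differ, and there are 42 daily-mix columns for 6 mixes over 7 days):

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_mixes = 13, 42
A = rng.normal(size=(n_features, n_mixes))  # column j = feature vector of daily mix j
dw = rng.normal(size=n_features)            # feature vector of the discover weekly

# solves min_w ||A w - dw||^2, i.e. the normal equations A^T A w = A^T dw
w, *_ = np.linalg.lstsq(A, dw, rcond=None)
residual = np.linalg.norm(A @ w - dw)
# with more weights (42) than feature equations (13) the system is
# underdetermined, so an essentially exact fit exists
print(residual)
```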

Calculated weights


However, let’s come back to reality from our sidestep into linear algebra. Using mathematics is nice, but it is important to keep it relevant, and negative weights really do not make sense in this situation. A negative weight would suggest that I intentionally ignored several playlists, which could be the case, but I doubt Spotify can measure this.

Non negative least squares


Luckily I found a package called “nnls” (non-negative least squares), which calculates the least squares estimate while constraining all weights to be non-negative.
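The R “nnls” package implements the classic Lawson–Hanson algorithm; SciPy exposes the same algorithm, so the fit can be sketched in Python as well (stand-in numbers again, same shapes as before):

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
A = rng.normal(size=(13, 42))   # features x daily mixes, as before
dw = rng.normal(size=13)

# least squares fit with every weight constrained to be >= 0
w, rnorm = nnls(A, dw)
print(w.round(3))               # many weights end up exactly zero
```

A nice side effect of NNLS is sparsity: the active-set algorithm leaves most weights at exactly zero, which reads naturally as “daily mixes I did not listen to.”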

Unfortunately Spotify’s recently-played endpoint only exposes the last 50 played songs, so I do not know if these predictions are truly correct. But just looking at the weights gives me some confidence that they are not too bad. I usually listen to about three daily mixes a day, and sometimes I listen to all of them in a day but each only a little, or one daily mix more than the others. It looks like the weights do reflect this.

Something I do have to add, and which might be worth looking into in the future, is that the weights are not distributed equally among the days: some days carry a lot more weight than others. This could make sense, since on some days you listen to music less than on others. However, ideally all days would be treated equally.

Let’s take a look at how well the weights approximate the discover weekly mix.

What does it mean and how well did it do


As you can see in the table, both least squares methods perform pretty well when fitting the discover weekly playlist.

The sum of squared errors for the regular least squares method is:

[1] 1.928051e-25

whilst non-negative least squares gets:

[1] 5.553706e-29

This came as a surprise, because the non-negative least squares method has more limitations (the weights are not allowed to be negative). This makes me think that my implementation of least squares may be flawed. Note, though, that both errors are effectively zero at floating-point precision: with 42 weights and far fewer features, the system is underdetermined, so both methods can fit the discover weekly essentially exactly, and the difference between the two numbers says more about numerical noise than about model quality. Still, it makes me happy that the more sensible approach with more realistic constraints performs at least as well.